Exploiting Morpheme and Cross-lingual Knowledge to Enhance Mongolian Named Entity Recognition

Abstract

Mongolian named entity recognition (NER) is not only one of the most crucial and fundamental tasks in Mongolian natural language processing, but also an important step toward improving the performance of downstream tasks such as information retrieval, machine translation, and dialog systems. However, traditional Mongolian NER models rely heavily on feature engineering. Even worse, the complex morphological structure of Mongolian words makes the data sparser. To alleviate the feature engineering and data sparsity in Mongolian named entity recognition, we propose a novel NER framework with Multi-Knowledge Enhancement (MKE-NER). Specifically, we introduce both linguistic knowledge through Mongolian morpheme representation and cross-lingual knowledge from a Mongolian-Chinese parallel corpus. Furthermore, we design two methods to exploit the cross-lingual knowledge sufficiently, i.e., cross-lingual representation and cross-lingual annotation projection. Experimental results demonstrate the effectiveness of our MKE-NER model, which outperforms strong baselines and achieves the best performance (94.04% F1 score) on the benchmark. Particularly, extensive experiments with different data scales highlight the superiority of our method in low-resource scenarios.
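The abstract names annotation projection without describing it. As a rough illustration of the general idea only (not the authors' implementation), the sketch below copies BIO entity labels from a labeled source sentence onto a target sentence through word-alignment pairs. The function name `project_labels`, the `(source_index, target_index)` alignment format, and the toy inputs are all assumptions made for this example.

```python
from typing import List, Optional, Tuple


def project_labels(
    src_labels: List[str],
    alignment: List[Tuple[int, int]],
    tgt_len: int,
) -> List[str]:
    """Project BIO labels from source tokens to aligned target tokens.

    Hypothetical helper: `alignment` holds (source_index, target_index)
    pairs, e.g. as produced by an external word aligner.
    """
    # Entity type per target token; None means "no entity projected".
    tgt_types: List[Optional[str]] = [None] * tgt_len
    for s, t in alignment:
        label = src_labels[s]
        # Keep the first entity type projected onto each target token.
        if label != "O" and tgt_types[t] is None:
            tgt_types[t] = label.split("-", 1)[1]

    # Rebuild a well-formed BIO sequence in target word order.
    # Simplification: adjacent target tokens of the same type are
    # merged into one entity span.
    tgt_labels: List[str] = []
    prev: Optional[str] = None
    for typ in tgt_types:
        if typ is None:
            tgt_labels.append("O")
        elif typ == prev:
            tgt_labels.append("I-" + typ)
        else:
            tgt_labels.append("B-" + typ)
        prev = typ
    return tgt_labels


# Toy example: the source has a two-token PER entity and a one-token LOC.
src = ["B-PER", "I-PER", "O", "B-LOC"]
align = [(0, 1), (1, 2), (2, 0), (3, 3)]
print(project_labels(src, align, tgt_len=4))
```

Target tokens with no aligned entity token stay `O`, so alignment errors translate directly into missing or noisy labels; real systems typically add filtering on top of a projection step like this.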

Similar Resources

Cross-Lingual Named Entity Recognition via Wikification

Named Entity Recognition (NER) models for language L are typically trained using annotated data in that language. We study cross-lingual NER, where a model for NER in L is trained on another, source, language (or multiple source languages). We introduce a language independent method for NER, building on cross-lingual wikification, a technique that grounds words and phrases in non-English text in...

Cheap Translation for Cross-Lingual Named Entity Recognition

Recent work in NLP has attempted to deal with low-resource languages but still assumed a resource level that is not present for most languages, e.g., the availability of Wikipedia in the target language. We propose a simple method for cross-lingual named entity recognition (NER) that works well in settings with very minimal resources. Our approach makes use of a lexicon to “translate” annotated ...

Cross-lingual named entity extraction and disambiguation

We propose a method for identifying and disambiguating named entities in a scenario where the language of the input text differs from the language of the knowledge base. We demonstrate this functionality on English and Slovene named entity disambiguation.

Exploiting Wikipedia as External Knowledge for Named Entity Recognition

We explore the use of Wikipedia as external knowledge to improve named entity recognition (NER). Our method retrieves the corresponding Wikipedia entry for each candidate word sequence and extracts a category label from the first sentence of the entry, which can be thought of as a definition part. These category labels are used as features in a CRF-based NE tagger. We demonstrate using the CoNL...

Exploiting entity-level morphology to Chinese nested named entity recognition

Named entity recognition plays an important role in many natural language processing applications. While considerable attention has been paid in the past to research issues related to named entity recognition, few studies have been reported on the recognition of nested named entities. This paper presents a morpheme-based dual-layer labeling method for Chinese nested named entity recognition. To a...

Journal

Journal title: ACM Transactions on Asian and Low-Resource Language Information Processing

Year: 2022

ISSN: 2375-4699, 2375-4702

DOI: https://doi.org/10.1145/3511098